ci(integration-tests): use github.token for check-run posters#801
Conversation
Follow-up to #799. The dispatch failure handlers and auto-pass steps were posting check-runs with `steps.public-token.outputs.token`, which is itself an App-token-generating step. That created a silent-failure trap: if the App secrets are missing or rotated, the App-token step fails, then the failure handler also fails (no token to authenticate with), and the gate sits green from the earlier `skip-integration-tests-pr` job's synthetic-success check — the exact silent-pass anti-pattern the failure handler exists to prevent. Discovered by exercising the dispatch end-to-end on a draft PR before the App secrets were installed (#800 closed). The canonical adbc-drivers/databricks workflow has the same latent bug — fix not yet upstreamed there. The fix is to use the default workflow `${{ github.token }}` for all check-posting steps. The default token already has `checks: write` because each job declares the permission. `steps.public-token` is no longer referenced anywhere; the generation step is removed to keep the workflow tidy. The App token is still used (correctly) for the actual dispatch call into databricks-driver-test, where cross-repo write access is required. Co-authored-by: Isaac Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
The e2e retry tests in retry_test_mixins.py assert exact urllib3 call counts (e.g. `mock_validate_conn.call_count == 6`) while mocking urllib3 globally via `_validate_conn` / `_get_conn`. The shared `TelemetryClientFactory` executor — populated by *prior* tests in the same worker — keeps firing background telemetry pushes that land on the same mocked urllib3 layer, inflating the counts and tripping the assertions. Drain the factory and route any new `initialize_telemetry_client` call to `NoopTelemetryClient` for the duration of the three tests that assert exact counts: - test_oserror_retries - test_retry_max_count_not_exceeded - test_retry_exponential_backoff Factory state is fully restored on exit so subsequent tests see the real telemetry pipeline. Co-authored-by: Isaac
Added: fix for pre-existing e2e test flakinessThe 1. Telemetry-executor pollution of retry-test mocks (FIXED in 68f4c60)
The fix adds an 2.
|
Summary
Follow-up to #799 (cross-repo IT dispatch). The dispatch failure handlers and auto-pass steps in `trigger-integration-tests.yml` were posting check-runs with `steps.public-token.outputs.token` — itself an App-token-generating step.
That created a silent-failure trap: if the `INTEGRATION_TEST_APP_ID` / `INTEGRATION_TEST_PRIVATE_KEY` secrets are missing or rotated:
Exactly the silent-pass anti-pattern the failure handler exists to prevent.
How I found it
Exercised the dispatch end-to-end on a draft PR before the App secrets were installed (#800, closed). The dispatch failed as expected, but the failure handler also failed with `Input required and not supplied: github-token`, and `Python Proxy Tests` stayed green from Phase 1's stale synthetic-success check. Full smoke-test write-up is in the closing comment on #800.
The fix
Use the default workflow token `${{ github.token }}` for all check-posting steps. The default token has `checks: write` because each job already declares the permission. The App token is still used (correctly) for the actual `peter-evans/repository-dispatch` call, where cross-repo write access is required.
Since `steps.public-token` is no longer referenced anywhere, the App-token generation step itself is also removed to keep the workflow tidy.
Note on the canonical pattern
The same latent bug exists in adbc-drivers/databricks's trigger-integration-tests.yml — that's where I copied the App-token pattern from. Not fixing upstream here, but worth flagging for the ADBC team.
Diff
Test plan
This pull request and its description were AI-assisted by Isaac.